Abstract Global climate models suffer from a persistent shortcoming in their 
simulation of rainfall by producing too much drizzle and too little intense rain. 
This erroneous distribution of rainfall is a result of deficiencies in the represen- 
tation of underlying processes of rainfall formation. In the real world, clouds are 
precursors to rainfall and the distribution of clouds is intimately linked to the 
rainfall over the area. This study examines the model representation of tropical 
rainfall using the cloud regime concept. In observations, these cloud regimes are 
derived from cluster analysis of joint-histograms of cloud properties retrieved from 
passive satellite measurements. With the implementation of satellite simulators, 
comparable cloud regimes can be defined in models. This enables us to contrast 
the rainfall distributions of cloud regimes in 11 CMIP5 models to observations 
and decompose the rainfall errors by cloud regimes. Many models underestimate 
the rainfall from the organized convective cloud regime, which in observations
provides half of the total rain in the tropics. Furthermore, these rainfall errors are
relatively independent of the model’s accuracy in representing this cloud regime. 
Error decomposition reveals that the biases are compensated in some models by a 
more frequent occurrence of the cloud regime and most models exhibit substantial 
cancellation of rainfall errors from different regimes and regions. Therefore, the
relatively accurate total rainfall in models conceals significant cancellation of
rainfall errors from different cloud types and regions. The fact that a good
representation of clouds does not lead to appreciable improvement in rainfall suggests
a certain disconnect in the cloud-precipitation processes of global climate models.


Keywords cloud regimes - model evaluation - model rainfall - tropics - CMIP5 - 
CFMIP2 - ISCCP 


1 Introduction 


Accurate projections of rainfall are vital to society in a changing climate for pur- 
poses ranging from monitoring flood hazards to managing water resources. How- 
ever, this is hindered by longstanding errors in global climate models. These errors 
include the persistent underestimation of heavy rain (e.g., Dai 2006; Sun et al 2006; 
Stephens et al 2010), incorrect timing in the diurnal cycle (e.g., Yuan et al 2013), 
and poor simulations of intraseasonal variability (e.g., Lin et al 2006; Jiang et al 
2015). Many of these errors have been variously attributed to deficiencies in the 
representation of subgrid-scale processes such as cloud microphysics (Kang et al 
2015) and deep convection (Folkins et al 2014). 

In the tropics, rainfall is intimately linked to clouds and convection. In partic- 
ular, cloud regimes derived from passive satellite observations of cloud properties 
(Jakob and Tselioudis 2003; Rossow et al 2005) have been used as proxies for 
various convective states to study rainfall (Lee et al 2013; Tan et al 2013; Rossow 
et al 2013). Amongst the key results from these studies are: the existence of an or- 
ganized convective cloud regime that is associated with exceptionally high rainfall 
and contributes to about half the total tropical rainfall despite a relatively low oc- 
currence (~5-10%); the existence of other less organized convective regimes with 
a moderate amount of rainfall; and the generally nonprecipitating nature of the 
majority of cloud regimes. In particular, Rossow et al (2013) raise the question
of how well global climate models are able to capture extreme rainfall when they 
lack a proper representation of organized convection. 

Given the insights that cloud regimes can provide on precipitation, the goal 
of this study is to use similar cloud regimes to evaluate model rainfall and exam- 
ine the rainfall-cloud relationship within 11 global climate models in the Coupled 
Model Intercomparison Project, Phase 5 (CMIP5) database, with the aim of con- 
tributing to an improvement in projections of tropical rainfall. Cloud regimes are 
routinely used in model evaluation due to the growing implementation of satel- 
lite simulators (Klein and Jakob 1999; Bodas-Salcedo et al 2011). Most of these 
studies focus on the representation of clouds, radiation, and climate sensitivity 
in the models (Williams et al 2005; Williams and Tselioudis 2007; Williams and 
Webb 2009; Tsushima et al 2013; Bodas-Salcedo et al 2014; Mason et al 2015; Jin 
et al 2017a,b), while some examined properties associated with particular weather 
systems or atmospheric phenomena (Gordon et al 2005; Chen and Del Genio 2009; 
Bodas-Salcedo et al 2012). A few studies even used cloud regimes to better un- 
derstand the effects of climate change (Williams and Tselioudis 2007; Tsushima 
et al 2016). However, to our knowledge, there have been no studies that evaluated 
the performance of model rainfall through the lens of cloud regimes. As such, this 
study will be the first to dissect errors in model rainfall through simulated cloud 
properties. We will show that models generally underestimate the rainfall of the
cloud regime associated with organized convection in observations, and that
this underestimation can be traced to incorrect rain rates associated with cloud
properties. We will also show that rainfall errors when specific regimes occur are
compensated by the errors in their frequencies and geographical distributions.

Fig. 1 Mean joint-histograms (or centroids) of the six ISCCP cloud regimes.


2 Data 
2.1 Observations 


The International Satellite Cloud Climatology Project (ISCCP) D1 dataset pro- 
vides joint-histograms of cloud occurrences as a function of cloud top pressure 
(CTP) and optical thickness (τ), constructed from passive retrievals on board a
network of geostationary and polar-orbiting satellites (Rossow and Schiffer 1999).
These 7 CTP-bin × 6 τ-bin joint-histograms describe the distribution of cloud
properties within 280 km × 280 km grids at three-hour intervals during sunlit hours
from July 1983 to December 2009. Applying a k-means clustering algorithm to
these joint-histograms objectively categorizes them into different cloud regimes or
weather states (Jakob and Tselioudis 2003; Rossow et al 2005). We repeated their
approach over the tropics (±15° latitude) but with daytime-averaged joint-histograms
(so that the daily time resolution matches that of the model outputs) and ob- 
tained six cloud regimes (Fig. 1), similar to those of Rossow et al (2005). Each 
daytime-averaged joint-histogram is then assigned to the cloud regime with the 
closest matching pattern; specifically, the daytime-averaged joint-histogram is as- 
signed to the cloud regime that has the lowest Euclidean distance between the 
42-dimensional vector from the 7 x 6 bins of the joint-histogram and the same 
vector corresponding to the cloud regime centroid (Fig. 1). The daily cloud regime 
field is then regridded from the native 280 km equal-area grid to a 2.5° equal-angle 
grid using the nearest neighbor technique. The geographical distributions of the 
six cloud regimes are shown in Fig. 2. 
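
The assignment step described above can be illustrated with a minimal Python sketch. It assumes NumPy; the function name assign_regimes and the synthetic arrays are hypothetical stand-ins, not the code or data used in this study.

```python
import numpy as np

def assign_regimes(histograms, centroids):
    """Assign each joint-histogram to the nearest centroid (Euclidean distance).

    histograms: array of shape (n_samples, 7, 6), CTP bins x tau bins
    centroids:  array of shape (n_regimes, 7, 6)
    Returns (labels, distances); labels are 0-based, so CR1 is label 0.
    """
    x = np.asarray(histograms, dtype=float).reshape(len(histograms), -1)  # (n, 42)
    c = np.asarray(centroids, dtype=float).reshape(len(centroids), -1)    # (k, 42)
    # Pairwise distances between every sample and every centroid: (n, k)
    d = np.linalg.norm(x[:, None, :] - c[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    return labels, d[np.arange(len(x)), labels]

# Synthetic stand-in data (not ISCCP values): six random centroids and
# slightly perturbed copies of them playing the role of daily histograms.
rng = np.random.default_rng(0)
centroids = rng.random((6, 7, 6))
truth = rng.integers(0, 6, size=100)
histograms = centroids[truth] + 0.01 * rng.standard_normal((100, 7, 6))
labels, distances = assign_regimes(histograms, centroids)
```

Returning the distance alongside the label is a design choice of this sketch: it makes it possible to rank how closely each sample resembles its assigned centroid.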

Fig. 2 Geographical distribution of the six ISCCP cloud regimes.

One of the six cloud regimes, CR1, has a mean joint-histogram that describes
a prevalence of deep convective clouds with widespread stratiform anvil clouds
(Fig. 1). The cloud regime generally resides in regions of frequent, vigorous
convection such as in the Intertropical Convergence Zone and the Tropical Western
Pacific (Fig. 2). Based on its cloud distribution, geographical location, heating 
profile, cloud radiative effect and large-scale environment, CR1 has been identi- 
fied as intense organized convection in previous studies (Jakob and Schumacher 
2008; Oreopoulos and Rossow 2011; Rossow et al 2013; Stachnik et al 2013; Tan 
et al 2013, 2015). The two other convectively-active cloud regimes, CR2 and CR3, 
describe more isolated modes of convection. Other studies have found that they 
have weaker ascending motions and heating profiles (Stachnik et al 2013; Hand- 
los and Back 2014). From its mean joint-histogram, CR2 has a lower population 
of deep convective clouds and a greater population of cirrus clouds, though its 
geographical distribution is broadly similar to CR1. CR2 has been observed to 
co-vary with CR1, suggesting a potential role in the life cycle of the large convective
systems that are often represented by CR1 (e.g., Tromeur and Rossow 2010; Mekonnen
and Rossow 2011; Tan et al 2013). CR3, on the other hand, is characterized by
a variety of lower and thinner clouds, some of which are consistent with cumulus
congestus clouds. With a greater frequency of occurrence than CR1 or CR2 (Table 1),
CR3 occurs in more regions of the tropics, especially over land, though over
areas of orography these “mid-level” clouds¹ may be close to the surface and thus
morphologically different from those over the ocean. CR4, CR5 and CR6 inhabit
convectively-suppressed environments and represent a thin cirrus regime, trade 
cumulus or fair weather regime, and stratocumulus regime respectively. As we will
see in Sec. 3.1, these three cloud regimes are generally nonprecipitating and thus
excluded from most of the subsequent analyses.

¹ Strictly speaking, these are clouds whose tops are located at mid-level altitudes.

Table 1 Frequencies of the six cloud regimes in observations (ISCCP) and in the models,
based on the 42-dimensional vector approach.

Dataset        CR1    CR2    CR3    CR4    CR5    CR6
ISCCP          0.085  0.116  0.225  0.114  0.391  0.068
all            0.104  0.020  0.190  0.245  0.372  0.069
bcc-csm1-1-m   0.083  0.011  0.230  0.320  0.328  0.028
CanAM4         0.142  0.026  0.107  0.211  0.453  0.061
CCSM4          0.114  0.003  0.241  0.215  0.398  0.029
CNRM-CM5       0.090  0.024  0.139  0.321  0.387  0.039
GFDL-CM3       0.121  0.019  0.103  0.471  0.204  0.083
GISS-E2-R      0.093  0.021  0.181  0.227  0.377  0.103
HadGEM2-A      0.106  0.041  0.183  0.188  0.384  0.098
IPSL-CM5B-LR   0.074  0.027  0.160  0.388  0.329  0.022
MIROC5         0.091  0.000  0.183  0.017  0.483  0.226
MPI-ESM-LR     0.075  0.028  0.096  0.429  0.290  0.082
MRI-CGCM3      0.142  0.040  0.215  0.185  0.368  0.049

For an observationally-based dataset of rainfall, we use the rainfall estimates 
from Tropical Rainfall Measuring Mission (TRMM) Multisatellite Precipitation 
Analysis (TMPA; also known as TRMM 3B42) Research product, Version 7 (Huff- 
man et al 2007). TMPA uses the TRMM Precipitation Radar and TRMM Mi- 
crowave Imager to calibrate and combine high quality estimates from passive mi- 
crowave instruments on board low-Earth-orbit satellites. Gaps in data are filled in 
by lower quality estimates from geosynchronous infrared measurements that are 
calibrated against microwave estimates on a monthly basis. Values over land are 
adjusted with gauges using the monthly gridded product from Global Precipita- 
tion Climatology Centre to control for biases arising from long-term drifts. TMPA 
has a resolution of 0.25° at 3-hour intervals, covering latitudes up to ±50°, with
data beginning in 1998. The rain rate field is averaged to 2.5° and daily resolution. 
Note that the accumulation period is over the full day instead of just over sunlit 
hours. This choice is to be consistent with the model data (Sec. 2.2), even though 
it will result in a discrepancy between the observation time periods of the cloud 
and precipitation data. Nevertheless, using daytime-averaged rain rates does not 
produce substantially different results than using daily-averaged rain rates (see 
Fig. 5 and Fig. S1). 
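
As a rough illustration of the averaging from 0.25° and 3-hourly to 2.5° and daily resolution, the following Python sketch block-averages a rain-rate array with NumPy. The function name coarsen_to_daily, the array layout (time, lat, lon) and the synthetic input are assumptions made for this example. The sketch uses a simple unweighted mean within each block and does not account for the variation of grid-cell area with latitude.

```python
import numpy as np

def coarsen_to_daily(rain, space_factor=10, steps_per_day=8):
    """Block-average a (time, lat, lon) rain-rate array.

    space_factor=10 turns 0.25 degree cells into 2.5 degree cells;
    steps_per_day=8 turns 3-hourly fields into full-day means.
    All three dimensions must divide evenly by their block size.
    NaNs (missing data) are ignored within each block.
    """
    nt, ny, nx = rain.shape
    blocks = rain.reshape(nt // steps_per_day, steps_per_day,
                          ny // space_factor, space_factor,
                          nx // space_factor, space_factor)
    return np.nanmean(blocks, axis=(1, 3, 5))

# Synthetic example: 2 days of 3-hourly data on a 20 x 40 fine grid.
rng = np.random.default_rng(1)
rain = rng.random((16, 20, 40))
daily = coarsen_to_daily(rain)   # shape (2, 2, 4)
```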


2.2 Models 


We utilize the models available in the Coupled Model Intercomparison Project, 
Phase 5 (CMIP5) (Taylor et al 2012). As part of the Cloud Feedback Model In- 
tercomparison Project, Phase 2 (Bony et al 2011), many of these models produce 
ISCCP-like joint-histograms (variable clisccp) using the ISCCP satellite simu- 
lator (Klein and Jakob 1999; Bodas-Salcedo et al 2011). These joint-histograms 
differ from those in ISCCP by having 7 bins in τ as opposed to 6, which we resolve
by summing the first two τ-bins. This yields a lowest τ-bin of 0–1.3, which
is arguably closer to the 0.02–1.27 in ISCCP than the 0.3–1.3 if we were to discard
the first τ-bin (see Jin et al 2017a for a detailed discussion). We also use precipitation
flux (variable pr) in the models. Both variables have a daily resolution, but
precipitation is the average over the full day and the joint-histograms are averages
over the sunlit hours. We restrict ourselves to the AMIP experiment, which
uses prescribed sea surface temperatures. After leaving out IPSL-CM5A-LR and
IPSL-CM5A-MR due to issues with the implementation of the ISCCP simulator
(J.-L. Dufresne, personal communication, 9th June 2015; see also
http://cmip-pcmdi.llnl.gov/cmip5/errata/cmip5errata.html), we use a total of 11 models in
this study (Table 2). We select the ensemble member r1i1p1 for all models except
for GISS-E2-R and CCSM4, which only have the variable clisccp for ensemble
members r6i1p1 and r7i1p1 respectively.

Table 2 CMIP5 models used in this study. All models are run under the AMIP experiment
setup. We select only one ensemble member from each model.

No.  Model Name     lat (°) × lon (°)   Modeling Center
1    bcc-csm1-1-m   1.121 × 1.125       Beijing Climate Center, China Meteorological Administration
2    CanAM4         2.791 × 2.812       Canadian Centre for Climate Modelling and Analysis
3    CCSM4          0.942 × 1.250       National Center for Atmospheric Research
4    CNRM-CM5       1.401 × 1.406       Centre National de Recherches Météorologiques / Centre Européen de Recherche et Formation Avancée en Calcul Scientifique
5    GFDL-CM3       2.000 × 2.500       NOAA Geophysical Fluid Dynamics Laboratory
6    GISS-E2-R      2.000 × 2.500       NASA Goddard Institute for Space Studies
7    HadGEM2-A      1.250 × 1.875       Met Office Hadley Centre
8    IPSL-CM5B-LR   1.895 × 3.750       Institut Pierre-Simon Laplace
9    MIROC5         1.401 × 1.406       Atmosphere and Ocean Research Institute (The University of Tokyo), National Institute for Environmental Studies, and Japan Agency for Marine-Earth Science and Technology
10   MPI-ESM-LR     1.865 × 1.875       Max-Planck-Institut für Meteorologie (Max Planck Institute for Meteorology)
11   MRI-CGCM3      1.121 × 1.125       Meteorological Research Institute
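
The merging of the two thinnest τ-bins of the simulator output can be sketched as follows. This is only an illustration: the function name merge_thinnest_tau_bins is hypothetical, and the sketch assumes the histogram array has already been arranged so that its last two axes are (CTP, τ), with τ ordered from thinnest to thickest, which may differ from the axis order in the archived clisccp files.

```python
import numpy as np

def merge_thinnest_tau_bins(clisccp):
    """Collapse 7 tau bins to 6 by summing the two thinnest bins.

    clisccp: array whose last two axes are (ctp, tau) = (7, 7),
             with tau ordered from thinnest to thickest.
    Returns an array whose last two axes are (7, 6).
    """
    clisccp = np.asarray(clisccp, dtype=float)
    merged_first = clisccp[..., :2].sum(axis=-1, keepdims=True)
    return np.concatenate([merged_first, clisccp[..., 2:]], axis=-1)

hist = np.full((3, 7, 7), 1.0)           # three synthetic histograms
hist6 = merge_thinnest_tau_bins(hist)    # shape (3, 7, 6)
```

Summing, rather than discarding, the first bin conserves the total cloud fraction in each histogram.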


The existence of the ISCCP-like joint-histograms allows us to define cloud 
regimes in the models. Since our goal is to evaluate models against observations, 
we choose to assign model joint-histograms to observed cloud regimes based on 
the Euclidean distance of the vector formed by each model joint-histogram to 
the vector formed by the observed cloud regime centroid. There are two ways 
to construct this vector: (i) we can simply use each of the 7 x 6 bins in the 
joint-histogram to form a 42-dimensional vector; or (ii) we can use a reduced 3- 
dimension vector formed by total cloud cover, mean cloud top pressure and mean 
albedo, which provides greater tolerance to minor errors in histogram binning (e.g. 
Gordon et al 2005; Williams and Webb 2009; Jin et al 2017b). We adopted the 
first approach of the 42-dimensional vectors for the following reason. As we will 
see, CR1 has a rain rate that is significantly higher than other CRs. Consequently, 
it is more important that we are able to capture the statistics of CR1 accurately. 
Table 1 shows that CR1 frequencies across the models are more reasonable in the 


Rainfall Errors in Model Cloud Regimes 7 


42-dimensional vector approach as compared to the 3-dimensional vector approach
(Table S1), which vastly overestimates the occurrence of CR1 for many models.
Furthermore, the supposed disadvantage of the 42-dimensional vector approach,
in which a slightly incorrect pixel placed into a neighboring CTP-τ bin is penalized
as harshly as any other incorrect bin, is actually not so severe because of the
tendency for cloud distributions to “clump” in the CTP-τ space. Hence, we chose
to identify model CRs by matching the 42-dimensional vector from the joint- 
histogram to the observed CR centroids. The mean joint-histograms of the model 
CR1, CR2 and CR3 are shown in Fig. 3, and the anomalies in their geographical 
distributions are shown in Fig. 4. In these figures, “all” refers to the aggregate 
data from all models. The other CRs are ignored because of their negligible effect 
on tropical rainfall (as will be discussed in Sec. 3.1). 

In general, CR1 in the models has a predominance of thick and high clouds,
just as in observations. However, the model mean joint-histograms display a greater
occurrence of thin cirrus clouds, as well as a tendency in some models (e.g. CCSM4
and GFDL-CM3) to produce clouds that are too high. The models are on aggregate
able to reproduce the frequencies of CR1, which may be unexpected given that CR1
largely represents organized convection in observations, while models are known
to lack a representation of organized convection in their parametrization
schemes (e.g., Arakawa 2004). For CR2, models produce clouds that are slightly too
high and thin but still retain a strong resemblance to observations. However, all
models severely underestimate the frequency of CR2 (Table 1). The mean
joint-histograms of model CR3 are more varied, with a high population
of thin cirrus clouds and, at the same time, mid-level clouds that are considerably
thicker. On one hand, it is not surprising that clouds in model CR3 are more
varied than in CR1 and CR2; looking at the centroids that model joint-histograms
were assigned to (Fig. 1), CR3 has the highest population of clouds in the center of
the joint-histogram, so naturally most mid-level clouds in the models would tend
to fall into CR3. On the other hand, the mean joint-histograms of model CR3
have a higher occurrence of thicker mid-level clouds, which may reflect a
potential bias in many models. In any case, this shows that model cloud regimes
may deviate in their patterns from observed cloud regimes, and this will be a
subject of investigation in Sec. 3.1.

Just as with observations, we restrict our analysis to the tropics (±15° latitude).
We perform our analysis over the period of 2001-2008, which provides us
with a sufficient sample size for our analysis. In examining the geographical dis- 
tributions, the variables need to be on the same grid, so for these analyses we 
interpolate them onto the 2.5° ISCCP grids using the nearest neighbor technique. 
For all other tropics-averaged analyses, we retain the variables in their native grids, 
and select ocean-only points to avoid ISCCP retrieval biases (see Tan et al 2013). 
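
For readers unfamiliar with nearest neighbor regridding, a minimal Python sketch is given below. It assumes regular latitude-longitude source and target grids, for which the nearest cell can be found separately along each axis; the function names and the example grids are hypothetical and are not the grids or code used in this study.

```python
import numpy as np

def nearest_index(src, dst, period=None):
    """Index of the nearest source coordinate for each target coordinate."""
    diff = np.abs(np.asarray(dst)[:, None] - np.asarray(src)[None, :])
    if period is not None:               # wrap-around distance for longitude
        diff = np.minimum(diff, period - diff)
    return diff.argmin(axis=1)

def regrid_nearest(field, src_lat, src_lon, dst_lat, dst_lon):
    """Nearest-neighbor regridding of a (lat, lon) field between regular grids."""
    iy = nearest_index(src_lat, dst_lat)
    ix = nearest_index(src_lon, dst_lon, period=360.0)
    return field[np.ix_(iy, ix)]

# Hypothetical 1 degree source grid and 2.5 degree target grid over the tropics.
src_lat = np.arange(-14.5, 15, 1.0)
src_lon = np.arange(0.5, 360, 1.0)
dst_lat = np.arange(-13.75, 15, 2.5)
dst_lon = np.arange(1.25, 360, 2.5)
field = src_lat[:, None] + 0.0 * src_lon[None, :]   # each cell holds its own latitude
out = regrid_nearest(field, src_lat, src_lon, dst_lat, dst_lon)
```

Because each target cell simply copies one source value, this approach is suitable for categorical fields such as regime labels, which cannot be meaningfully interpolated.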


3 Results 
3.1 Rain rate distributions 
All six observed cloud regimes have different distributions of rain rates (Fig. 5).
CR1 has the highest rain rates, which generally decline as we progress towards
CR6. In particular, CR1 has a unique rain rate distribution, with strikingly high
rain occurrence (i.e. nonzero rain rates) and rain rate values. This is consistent
with previous studies (Lee et al 2013; Tan et al 2013, 2015), which found that CR1
contributes to about half the total rainfall in the tropics. This intense rainfall
reflects the fact that CR1 represents organized convection, which is associated
with intense deep convection and large areas of stratiform rain. The next two
regimes with the highest rainfall are CR2 and CR3. While their rain rates are not
as high as CR1, they still have significant occurrences of rain. Both have similar
distributions, which is perhaps due to the fact that both represent less organized
forms of convection. CR4, CR5, and CR6 are associated with low rain rates and
have no rain most of the time. These nonprecipitating regimes reside in subsiding
environments (Tan et al 2013; Handlos and Back 2014) and contribute little to the
total tropical rainfall despite a comparatively high frequency of occurrence. Because
CR1 to CR3 together contribute 85% of the total tropical rainfall (Fig. 5), we will
focus on these three convectively-active cloud regimes from here on.

Fig. 3 Mean joint-histograms of model CR1, CR2 and CR3, derived through the 42-dimensional
vector approach. “all” refers to the mean of joint-histograms from all models.
The ISCCP mean joint-histograms are included for ease of comparison.

Fig. 4 Anomalies in the geographical distribution of the frequencies of CR1, CR2 and CR3
of the models; i.e. the geographical distribution of model regimes minus the geographical
distribution of observed regimes (Fig. 2). Note that we chose blue to indicate positive anomaly
due to its association with wet conditions.
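
The fractional contribution of each regime to the total rainfall follows from its frequency and mean rain rate. The Python sketch below shows one way this could be computed from co-located regime labels and rain rates. The function name rain_contribution and the toy numbers are illustrative only, and every sample is given equal weight, which is an assumption of this example.

```python
import numpy as np

def rain_contribution(labels, rain, n_regimes=6):
    """Frequency, mean rain rate and fractional rainfall contribution per regime.

    labels: integer regime index (0-based) for each grid box and day
    rain:   rain rate (mm/day) for the same samples
    """
    labels = np.asarray(labels).ravel()
    rain = np.asarray(rain, dtype=float).ravel()
    counts = np.bincount(labels, minlength=n_regimes)
    totals = np.bincount(labels, weights=rain, minlength=n_regimes)
    freq = counts / counts.sum()
    # Mean rain rate when the regime occurs; zero for regimes that never occur.
    mean_rate = np.divide(totals, counts, out=np.zeros(n_regimes), where=counts > 0)
    fraction = totals / totals.sum()
    return freq, mean_rate, fraction

# Toy samples, not observed values.
labels = [0, 0, 1, 1, 1, 2]
rain = [20.0, 10.0, 4.0, 2.0, 0.0, 0.0]
freq, mean_rate, fraction = rain_contribution(labels, rain)
```

In this toy case the first regime occurs one third of the time but supplies most of the rain, which mirrors qualitatively the behavior described for CR1.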

With the observed regime rain rate distribution in mind, we now investigate
the rain rates from the models. Fig. 6 shows the distributions and statistics from
observations (left-half of the “violin”) and various models (right-half of the “violin”).
Note that the left halves of the “violins” (observations) are identical across the
models and are repeated to facilitate the comparison. The “all” category is based
on the combined set of values from all models. For CR1 (Fig. 6a), many models
struggle to reproduce the high rain rates of the observations. Some models,
such as bcc-csm1-1-m, GISS-E2-R and MRI-CGCM3, have too many incidences
of no-rain. The distributions of some other models, such as CNRM-CM5, have
the correct shape but are biased low. However, a few models, such as CCSM4,
HadGEM2-A and MIROC5, are broadly able to capture both the distribution and
the statistics of the rain rates in CR1. Interestingly, the high upper-whisker (95th
percentile) for IPSL-CM5B-LR indicates the presence of more outliers than in
observations, implying the existence of a long tail in the distribution. This can also
be seen in bcc-csm1-1-m, though in this case the overall low values suppress
the magnitude of the outliers.

Fig. 5 “Violin” diagrams (Hintze and Nelson 1998) showing the distributions (gray curves),
the median (red line), the interquartile range (blue box) and the 5th and 95th percentiles
(whiskers) of rain rates for each ISCCP cloud regime. Numbers at top show the fractional
contribution to total tropical rainfall.

For CR2 and CR3 (Figs. 6b and c), the rain rate distributions of the models
show varying degrees of fidelity. Just as with CR1, some models, such as
bcc-csm1-1-m, underestimate the rain rates. However, unlike for CR1, there are models,
such as IPSL-CM5B-LR and MPI-ESM-LR, that are significantly biased high.
As for incidences of no-rain, some models overestimate its occurrence (e.g.
MRI-CGCM3 for CR2 and CR3) while some underestimate it (e.g. MIROC5 for CR2).
However, because of the overestimation and underestimation of various models,
the combined values for CR2 and CR3 from all models have distributions and
statistics that resemble observations.

While it appears that the rain rates of the three cloud regimes in these 11
climate models are deficient in a myriad of ways, there is the possibility that
these errors arose because of a poor identification of model cloud regimes. Recall
that model regimes are derived by assigning model joint-histograms to the
closest centroids of observed regimes, where closest is defined by the lowest
Euclidean distance between the 42-dimensional vectors formed by the 7 × 6 bins of the
joint-histograms and the observed regime centroid. This assignment, and thus
the model cloud regimes, may not perfectly capture the dominant cloud patterns
within the models. We investigate whether the rain rate errors in Fig. 6 can be
attributed to a poor identification of model cloud regimes by selecting the 10% of
joint-histograms that have the lowest Euclidean distances in the assignment process.
That is, these model cloud regimes are the subsets that most resemble their
observational counterpart (Fig. 1). The mean joint-histograms of these subsets
are shown in Fig. 7. Overlaying the rain rate distributions of these subsets over
the original model distributions, we can see a mixed response from these “better”
cloud regimes (Fig. 8). In some cases, such as CR1 in IPSL-CM5B-LR and CR2
in GFDL-CM3, the subsets possess rain rate distributions that are closer to observed
values. In some cases, such as CR1 in CNRM-CM5 and CR3 in MIROC5,
the subset has worse distributions. However, in many cases, there is no discernible
difference between the subset and the entire population of model regimes. This
implies that the accuracy of the rain rate distributions of these three regimes in
the models is unrelated to how well the models represent clouds.

Fig. 6 “Violin” diagrams showing the distributions and statistics of rain rates for CR1, CR2
and CR3 of each model. The left-half of each “violin” corresponds to observations (Fig. 5)
while the right-half shows the distribution from the model CR. Note the difference in scale
between the subplots. See Fig. 5 on interpreting the diagram.
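
The selection of the best-matching subset can be sketched in Python as below. The sketch assumes that the 10% is taken separately within each regime, which is one reading of the procedure, and the function name best_matching_subset and the toy inputs are hypothetical.

```python
import numpy as np

def best_matching_subset(labels, distances, regime, fraction=0.10):
    """Indices of the samples in one regime with the smallest centroid distance."""
    members = np.flatnonzero(np.asarray(labels) == regime)
    n_keep = max(1, int(round(fraction * members.size)))
    order = np.argsort(np.asarray(distances)[members])
    return members[order[:n_keep]]

# Toy inputs: 20 samples in regime 0 and 10 in regime 1, with distances
# that decrease from 29 to 0 along the sample index.
labels = np.array([0] * 20 + [1] * 10)
distances = np.arange(30)[::-1].astype(float)
idx0 = best_matching_subset(labels, distances, regime=0)
idx1 = best_matching_subset(labels, distances, regime=1)
```

The returned indices can then be used to subset the co-located rain rates before recomputing their distribution.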


3.2 Error decomposition 


For the overall rainfall in the tropics, the rain rates of the cloud regimes in the
previous section are only part of the picture; they describe only how much rain
falls when a regime occurs in a grid box. How frequently the regimes occur also affects
the total rainfall. To obtain a better idea of the contributions from various aspects
of the regimes, we assume that total rainfall can be expressed as the frequency-weighted
average of regime rain rates, $P_o = \sum_i f_{i,o} \times p_{i,o}$, where $f_{i,o}$ and $p_{i,o}$ are
the frequency and rain rate of CRi for the observations (indicated by the subscript
‘o’) and likewise for the models. We can then decompose the total rainfall error
$\Delta P$ into

$$\Delta P = \sum_i \left( \Delta f_i \times p_{i,o} + f_{i,o} \times \Delta p_i + \Delta f_i \times \Delta p_i \right), \tag{1}$$

where $\Delta$ denotes model minus observations. The three components in Eq. (1)
represent the contribution due to error in the frequency of the regime ($\Delta f \times p$),
the contribution due to error in the rain rate of the regime ($f \times \Delta p$), and the
contribution due to the second-order co-variational error in the frequency and
rain rate of the regime ($\Delta f \times \Delta p$). These terms, together with the sum of all three
terms for each model and regime, are shown in Fig. 9.

Fig. 7 Mean joint-histograms from the subset of “best-matching” model CR1, CR2 and CR3,
based on the 10% of joint-histograms with lowest Euclidean distance in the assignment process.
The ISCCP mean joint-histograms are included for ease of comparison.
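
A short numerical sketch of the decomposition in Eq. (1) is given below. The frequencies are the ISCCP and CanAM4 values for CR1 to CR3 from Table 1, but the rain rates are placeholder numbers chosen only to exercise the code, not values from this study; the function name decompose_rain_error is likewise hypothetical.

```python
import numpy as np

def decompose_rain_error(f_obs, p_obs, f_mod, p_mod):
    """Split the per-regime rainfall error into the three terms of Eq. (1)."""
    f_obs, p_obs, f_mod, p_mod = (
        np.asarray(a, dtype=float) for a in (f_obs, p_obs, f_mod, p_mod)
    )
    df = f_mod - f_obs          # error in regime frequency
    dp = p_mod - p_obs          # error in regime rain rate
    return {
        "freq": df * p_obs,     # frequency term
        "rate": f_obs * dp,     # rain rate term
        "cross": df * dp,       # second-order cross term
    }

f_obs = [0.085, 0.116, 0.225]   # ISCCP, CR1 to CR3 (Table 1)
f_mod = [0.142, 0.026, 0.107]   # CanAM4, CR1 to CR3 (Table 1)
p_obs = [20.0, 6.0, 4.0]        # placeholder rain rates (mm/day), not observed values
p_mod = [12.0, 7.0, 4.0]        # placeholder rain rates (mm/day), not model values

terms = decompose_rain_error(f_obs, p_obs, f_mod, p_mod)
total = terms["freq"] + terms["rate"] + terms["cross"]
```

By construction, the three terms for each regime sum exactly to the difference between the model and observed products of frequency and rain rate, so the decomposition leaves no residual.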

We begin first with the contribution from errors in the rain rates ($f \times \Delta p$).
The salient points from the previous section are reflected in Fig. 9: the universal
underestimation of CR1 rain rates, as well as positive and negative biases for
CR2 and CR3 amongst different models with a multimodel mean of near-zero.
However, as opposed to the “pure” error in rain rates, the contribution to total
rainfall error is modulated by the frequency of the cloud regime. This is illustrated
in the case of MIROC5, in which the errors in the median rain rates for CR1 and
CR2 are of comparable magnitudes (~5 mm / day; Fig. 6), but because CR1
has a higher frequency in MIROC5 than CR2, its contribution to the total error
is larger. This demonstrates the point that such error decomposition provides a
more comprehensive perspective of the overall rainfall error than just the rain rate
distributions.

Fig. 8 The distributions in Fig. 6 for observations and the models (gray), overlaid with
distributions from the 10% subset of samples with lowest Euclidean distances on the right-half
(green).


The contribution from errors in the frequencies of the cloud regimes ($\Delta f \times p$) is
therefore strongly influenced by how well the model is able to simulate the occurrence
of the regimes. Indeed, there is a close correspondence between Fig. 9 and
Table 1. In particular, CanAM4, CCSM4, GFDL-CM3, HadGEM2-A and MRI-CGCM3
all overestimate the frequency of CR1 by a fairly large amount, resulting
in a large positive contribution to the total rainfall error. However, in the case
of CanAM4, CCSM4 and GFDL-CM3, this positive contribution is balanced by
the negative error from the rain rates. In other words, these three models produce
CR1 too frequently but with too low a rain rate, such that the contribution of error
from CR1 is small. As for CR2 and CR3, consistent with the underestimation
of their frequencies in nearly all models, the contributions are nearly all negative.
The magnitudes of the errors are especially high for CR2, as models appear unable
to produce this cirrus-dominated regime with sufficient frequency.

Fig. 9 Decomposition of the total rainfall error into the contribution from the error in CR
frequency ($\Delta f \times p$), the contribution from the error in rain rates ($f \times \Delta p$), and the cross-term
($\Delta f \times \Delta p$). The sum of all three errors for each model cloud regime is indicated by the gray
box.

Lastly, the cross-terms ($\Delta f \times \Delta p$) are, with a few exceptions, of lower magnitudes
than the other two terms within the same model and CR. This demonstrates
that the contribution from the second-order co-variation of the errors in frequency
and rainfall is generally low, though it cannot be ignored in certain cases such as
CR1 in CanAM4 and CR3 in CNRM-CM5.


3.3 Geographical distribution of CR1 errors 


The error decomposition of the previous section provides a tropics-averaged anal- 
ysis of the contributions from various aspects of the cloud regimes; further insights 
can be gleaned by examining the geographical distributions of these contributions. 
Here, we focus on CR1 due to its strong but diverse impact on tropical rainfall. 
Since the contribution from errors in frequency has a similar geographical distribution
to that of the anomalies in regime frequencies themselves, we can use Fig. 4
as an indication of this contribution. The five models (CanAM4, CCSM4, GFDL- 
CM3, HadGEM2-A and MRI-CGCM3) identified in the previous section with a 
strong positive contribution from CR1 show overestimation of its occurrence in 
most areas in the tropics. More intriguingly, for other models, while the magni- 
tudes of errors from CR1 frequency are low, there are considerable geographical 
variations in the errors. In fact, the low error in CR1 frequency in the tropics for 
these models (and the corresponding low contribution to total rainfall error) is not 
so much because the models are able to simulate CR1 accurately in all regions,
but because there are compensating errors between different regions. Averaging 
the contributions from CR1 frequency over the tropics masks the biases in different 
geographical regions. 
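
The masking effect of spatial averaging can be seen in a toy example. The sketch below uses hypothetical latitude bands and error values (not data from this study) and a cosine-of-latitude area weighting, which is a common choice assumed here rather than one stated in the text.

```python
import math


def area_weighted_mean(values, lats_deg):
    """Mean of values weighted by cos(latitude), a proxy for grid-cell area."""
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical error contributions (mm/day) in four regions at these latitudes
lats = [-10.0, -5.0, 5.0, 10.0]
errors = [0.6, -0.5, 0.5, -0.6]

mean_error = area_weighted_mean(errors, lats)
mean_abs_error = area_weighted_mean([abs(e) for e in errors], lats)

# The signed mean is near zero although every region has a sizeable bias
print(round(mean_error, 3), round(mean_abs_error, 3))
```

Comparing the signed mean with the mean of the absolute errors is one simple way to flag a low tropics-averaged error that arises from regional cancellation rather than from regional accuracy.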

As for the geographical distributions of the error contributions from the rain 
rates of CR1, Fig. 10 leads to a similar conclusion: a low tropics-averaged error is
due to cancellation between different regions. Models with smaller rain rate errors
from CR1 in Fig. 9 (HadGEM2-A, IPSL-CM5B-LR, MIROC5 and MPI-ESM-LR) 
also have substantial area in the tropics with a positive contribution. However, the 
regional biases are different across models. For example, an overestimation in the 
eastern Pacific is only present in HadGEM2-A, whereas an overestimation over
the eastern Indian Ocean is only clear in IPSL-CM5B-LR. In contrast, models such as
CNRM-CM5 and GISS-E2-R, which have a large negative tropics-averaged error, 
show underestimation of CR1 rain rates in almost all regions of the tropics. 


4 Discussion


4.1 Connection between clouds and precipitation in models


In Sec. 3.1, we showed that models have varying degrees of success in simulating
the rain rates of the three convective cloud regimes. Due to the “fuzzy” nature of
the regime assignment process in models, it is possible that some model regimes
include joint-histograms that do not possess the actual physical characteristics
of their regime membership. Indeed, the mean joint-histograms of the model
regimes (Fig. 3), while displaying broadly similar patterns to the observed regime
centroids, exhibit some deviations upon closer examination (e.g. a greater proportion
of clouds in the top-left thin cirrus bin for CR1). However, when the evaluation
is restricted to the subset of joint-histograms that bear the closest resemblance to
the observed centroids, there is no discernible or consistent improvement in the rain
rates, leading us to conclude that an improved representation of clouds may not
yield better rain rate distributions.
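
The subsetting step can be sketched as follows. The sketch assumes that each joint-histogram is flattened into a vector, that regime membership is assigned by the smallest Euclidean distance to an observed centroid, and that the best-matching subset is the fraction of members closest to their centroid. The function names, the default fraction and the tiny 3-element vectors are illustrative only and do not reproduce the processing used in this study.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_regime(hist, centroids):
    """Return (index of the nearest centroid, distance to it)."""
    distances = [euclidean(hist, c) for c in centroids]
    k = min(range(len(centroids)), key=distances.__getitem__)
    return k, distances[k]


def best_matches(hists, centroids, regime, fraction=0.1):
    """Histograms assigned to `regime`, keeping only the closest `fraction`."""
    members = []
    for h in hists:
        k, d = assign_regime(h, centroids)
        if k == regime:
            members.append((d, h))
    members.sort(key=lambda pair: pair[0])  # closest to the centroid first
    n_keep = max(1, math.ceil(fraction * len(members))) if members else 0
    return [h for _, h in members[:n_keep]]


# Hypothetical 3-bin "histograms" and two centroids, for illustration only
centroids = [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]
hists = [[0.7, 0.2, 0.1], [0.8, 0.1, 0.1], [0.5, 0.3, 0.2], [0.1, 0.2, 0.7]]

subset = best_matches(hists, centroids, regime=0, fraction=0.5)
print(subset)
```

Rain rate statistics computed over such a subset can then be compared with those of the full regime membership to test whether closer cloud agreement brings closer rainfall agreement.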

Looking at the difference in the mean joint-histograms between the subset 
(Fig. 7) and entire set (Fig. 3), and comparing it to the observed centroids (Fig. 
1), it is clear, as expected, that restricting the model joint-histograms to the best
matches shifts the cloud pattern towards the observed pattern. For example, in CR1 of
IPSL-CM5B-LR, there is an appreciable reduction in the overabundance of thin 
cirrus clouds (top-left bin); in CR1 of CNRM-CM5, the entire cloud distribution is 
shifted from overly-thick clouds towards optically-thinner clouds. Yet, the former 
example has improved, higher rain rates while the latter example has worse, lower 
rain rates. In fact, a consistent result that emerges from this subset analysis is
a relationship between the rain rate and the height or thickness of the cloud. If
a joint-histogram has higher or thicker clouds, it will have stronger rain rates
regardless of the absolute rainfall associated with clouds of a particular height and
thickness. Therefore, any improvements to the cloud-precipitation relationship in
the models will need to address the incorrect rain rates associated with each CTP-τ
bin.

Fig. 10 Geographical distribution of the contributions from the error in mean rain rate (f ×
Δp) of CR1 for each model. The average tropical error for each model is shown in the number
at the bottom-right. Note that we chose blue to indicate positive anomaly due to its association
with wet conditions.


4.2 Compensating errors between cloud regimes and regions


While the rain rate distributions of various cloud regimes are illuminating in terms 
of understanding how well models are able to produce the rainfall associated with 
each weather system, they do not provide the complete picture. As we have shown 
in Sec. 3.2, how often the regimes occur is also a relevant factor in determining 
the total rainfall in the tropics, which is perhaps of greater concern for issues
such as the total water budget in the tropics. Indeed, Fig. 9 illustrates this point
well: although all models underestimate the rain rates of the intense CR1 (and
thus incur a negative error in total rainfall), this is partially offset by a positive
frequency error in several models. In fact, the error due to underestimating the
occurrence of CR2 has a greater impact on total rainfall in numerous models (Fig. 
9). It is worth noting that the use of the reduced 3-dimensional vector that is 
common in model evaluation of cloud regimes (e.g. Gordon et al 2005; Williams 
and Webb 2009; Tsushima et al 2013; Jin et al 2017b) does improve the frequency of 
CR2 in the models, though at the expense of an overestimation in CR1 frequency. 
Therefore, both the regime frequency and the regime rain rate must be considered 
for a complete picture of total error in tropical rainfall. 

On top of compensating errors between various regimes as well as between 
regime frequency and rain rate, Sec. 3.3 also identified errors in “well-performing” 
models over different regions in the tropics that cancel out upon spatial averaging. 
As a matter of fact, Figs. 4 and 10 suggest that a low error does not necessarily 
indicate that a model is good at producing the correct regime frequency and rain 
rate, but is often the result of an almost even presence of positive and negative 
biases. The presence of compensating errors may not be surprising, given that 
spatially- and temporally-averaged quantities, such as monthly zonal mean precipitation,
are probably the main guidance during model development and are thus
more likely to resemble observations. Hence, analyses involving error decomposition
and regional breakdown can unmask compensating errors and provide a
more comprehensive view of the weaknesses in rainfall simulations.


5 Conclusions 


In this study, we investigated the tropical rainfall in 11 CMIP5 models through 
the lens of cloud regimes. In observations, cloud regimes are categorizations of var- 
ious convective environments based on passive satellite retrievals of cloud proper- 
ties. With the implementation of satellite simulators, we can identify model cloud 
regimes by assigning model clouds to observed regime centroids. We examined 
the rain rate distributions of the three convectively-active cloud regimes in the 
models. We find that many models underestimate the rain rates of CR1, which
in observations represents organized convection, though a few models are broadly
able to reproduce the observed rain rate distribution. For CR2 and CR3, which
represent less organized convective environments, the models have varying performance,
with both positive and negative biases such that the ensemble of model
values appears close to observations. Restricting to cases in which model clouds best
resemble observed clouds does not necessarily improve the performance, which we
argue is due to an incorrect rain rate associated with the cloud pixel of each CTP-τ
combination. To attain a more comprehensive view of the impact of these errors on
total tropical rainfall, we performed an error decomposition that reflects the contribution of
regime frequency and regime rain rates to total rainfall errors. Models generally
have a negative contribution from CR1 rain rates and CR2 frequency, though in
some cases, particularly for CR1, there are cancellations in the errors between
frequency and rain rates. Furthermore, an examination of the geographical distributions
of the errors revealed that a low tropics-averaged error is primarily a
result of cancellation between positive and negative errors in different regions.


ISCCP cloud regimes have been regularly employed for the model evaluation 
of clouds and radiation (e.g. Williams et al 2005; Williams and Webb 2009; Bodas- 
Salcedo et al 2014; Mason et al 2015; Tsushima et al 2016; Jin et al 2017a), but 
this is the first time, to our knowledge, that they have been used to study model 
performance on rainfall. By examining model rainfall through cloud regimes, we 
place a stringent demand on models, as a good performance requires the model to 
represent both clouds and rainfall accurately at the same time. Such an approach 
transcends evaluations based on spatially- and temporally-averaged quantities, 
which are the primary guidance during model development. The benefit is a more 
informative categorization of the errors than a simple, straightforward evaluation 
of model rain rates, revealing the weather systems that suffer deficiencies in models
and the existence of compensating errors.


Precipitation has perennially been a challenge in models. In our study, biases 
such as the underestimation of heavy rain (e.g. Stephens et al 2010) manifest in 
the rain rate distributions of CR1. The fact that rain rates do not improve when
examined for only the best-matching clouds suggests a possible disconnect between
the resolved variables, which determine the subgrid column profiles that the satellite
simulator relies on, and the surface rainfall in the models, which results jointly
from large-scale and convective (parametrized) precipitation. In particular,
the lack of a representation for organized convection is an issue widely known to
the convection community and model developers (e.g. Arakawa 2004; Moncrieff
et al 2012; Rossow et al 2013; Houze et al 2015). While there is evidence that
explicit representations of convection, such as through superparametrization or
convection-permitting simulations, improve the distribution of rainfall (e.g. Meredith
et al 2015; Kooperman et al 2016; Kendon et al 2017), the majority of models
currently in use, including all in the CMIP5 repository, still employ conventional
convective parametrizations. Since convection-resolving global climate models will
remain too computationally demanding in the near future, improved representation
of tropical rainfall will have to come from the development of a better convective
parametrization scheme.

Fig. S1 Mean joint-histograms of model CR1, CR2 and CR3, derived from the 10% of
joint-histograms with the lowest Euclidean distance. The ISCCP mean joint-histograms are 
included for ease of comparison. 

Table S1 Frequencies of the six cloud regimes in observations (ISCCP) and in the models,
based on the reduced 3-dimensional vector approach.

Dataset       | CR1   | CR2   | CR3   | CR4   | CR5   | CR6
ISCCP         | 0.081 | 0.067 | 0.223 | 0.100 | 0.426 | 0.103
bcc-csm1-1-m  | 0.158 | 0.087 | 0.105 | 0.183 | 0.315 | 0.152
CanAM4        | 0.154 | 0.129 | 0.040 | 0.144 | 0.386 | 0.146
CCSM4         | 0.211 | 0.063 | 0.058 | 0.146 | 0.427 | 0.096
CNRM-CM5      | 0.202 | 0.123 | 0.108 | 0.146 | 0.350 | 0.072
GFDL-CM3      | 0.134 | 0.237 | 0.046 | 0.224 | 0.258 | 0.101
GISS-E2-R     | 0.108 | 0.097 | 0.215 | 0.138 | 0.246 | 0.195
HadGEM2-A     | 0.076 | 0.077 | 0.063 | 0.176 | 0.504 | 0.104
IPSL-CM5B-LR  | 0.071 | 0.186 | 0.026 | 0.225 | 0.401 | 0.091
MIROC5        | 0.134 | 0.000 | 0.199 | 0.044 | 0.255 | 0.368
MPI-ESM-LR    | 0.083 | 0.198 | 0.048 | 0.239 | 0.266 | 0.166
MRI-CGCM3     | 0.152 | 0.129 | 0.058 | 0.139 | 0.433 | 0.090


